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Abstract 

We study estimators for the variance parameter of a stationary process. 
The estimators are based on weighted Cramer-von Mises statistics, and certain 
weightings yield estimators that are “first-order unbiased” for o^. We derive an 
expression for the asymptotic variance of the new estimators; this expression is 
then used to obtain the first-order unbiased estimator having the smallest variance 
among fixed-degree polynomial weighting functions. Although our work is based 
on asymptotic theory, we present exact and empirical examples to demonstrate the 
new estimators’ small-sample robustness. 
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1 Introduction 

Consider a stationary process Yi,Y 2 ,...,Yn with mean ft. Such processes are often en- 
countered in the context of steady-state simulation. For instance, the K’s might represent 
successive customer transit times in a complicated queueing system that has been run to 
steady state. If one is interested in estimating /z, the obvious unbiased pojnt estimator 
is the sample mean A measure of the sample mean’s precision is Var(y'„), which is 
unknown. 
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In this article, we investigate new estimators for Var(K„), or equivalently, for 
<7^ = nVar(K„). A related quantity is also of interest — the variance parameter, 

<7^ = limn-.oo The literature studies many variance estimation methods; e.g., batch 
means, independent replications, spectral analysis, overlapping batch means, regenera- 
tion, autoregressive modeling, and standardized time series (STS) (see [3]). The esti- 
mators presented herein are beised on weighted functionals of standardized time series. 
We shall show that the new estimators are asymptotically unbicised for and that they 
have lower variance than competing estimators. 

We first give some background. The standardized time series is defined as 



TM) 



CTy/n 



for 0<t<l, 



where Yj = j = l,...,n, and is the greatest integer function 

(Schruben [20]). Under mild conditions (see Foley and Goldsman [8], Glynn and Iglehart 
[9], or Schruben [20]), one can show that 

(^/;^(F„-^),<7^„) => (<tW(1),(t5), 



where W is a standard Brownian motion process, B is a standard Brownian bridge 
process on [0, 1], and ^ denotes weak convergence (cis n becomes large) on ZP[0, 1], the 
space of right-continuous functions on [0, 1] having left-hand limits. It is well-known 
that all finite-dimensional joint distributions of B are normal and Cov {B{s),B{t)) = 
min(s,<)(l — max(s,<)), 0 < s,i < 1. Further, one can express B{t) = <W(1) — W(<); so 
it is easy to show that VV(1) and B are independent, and thus y/n{Yn — /^) «Lnd aTn are 
aisymptotically independent. 

The remainder of the paper is organized as follows. §2 reviews the STS weighted area 
estimator for <7^; the weighted area estimator will serve as a benchmark for comparison 
in the subsequent sections. §3 presents new estimators similar to weighted Cramer-von 
Mises (CvM) statistics, and establishes some of their properties. In particular, we find a 
class of CvM estimators that is “first-order unbiased” for cr^; these estimators also have 
lower variance than that of the weighted area estimator. Performance of the estimators 
is studied in §§4 and 5, where we present exact and empirical results, respectively. §6 
summarizes and discusses the results of the previous sections. §7 proposes a number of 
augmentations to the basic CvM estimator and provides conclusions. 
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2 The Weighted Area Estimator 

We start with a discussion of the so-called weighted area estimator for first popularized 
by Schruben [20]. Suppose we define 

n 

and j 

A{f) = f 
Jo 

where /(f) is continuous on [0, 1], not dependent on n, not identically zero, and normalized 
so that Var(/1(/)) = i.e., 

Var(//(()6(0<«) = 2f^jj(s)f(l)s(\-t)dsdl = 1. (1) 

(The above expression can be simplified a great deal; see [12].) Thus, A{f) ~ cTNor(0, 1). 
Goldsman and Schruben [13] (also see [7], [9], and [20]) show that A{f\n) a4(/), 
where denotes convergence in distribution as n becomes large. So by the continuous 
mapping theorem (see Theorem 5.1 of Billingsley [2]), 

Since the limiting random variable A{f) is the weighted area under a Brownian bridge 
process, we refer to A^[f\n) as the weighted area estimator for <r^. 

Before stating the main result on the weighted area estimator, we introduce notation 
that will be useful in the sequel. Define the covariance function Rk = Cov(Fi, Fi+*), 
k = 0,±1,±2, ..., and the quantities 7 = — 253 ^ 1 ^-^fci ^ = /J/(5)ds, and F = 
fd So Further, the notation p(n) = 0 ( 7 (n)) means that |p(n)/ 9 (n)| < K as 

n — > 00 for some constant K, and p(n) = o{q{n)) means that p{n)/q{n) —* 0 as n — 00 . 



The main theorem for the weighted area estimator is as follows. 



Theorem 1 (see [8] and [12]) Under mild conditions on the covairiance function, the 
expected value of the weighted area estimator is 



”)] = <^^ + 



|(F-F)^+F^h 

2n 



-f o(l/n). 



Under an additional uniform integrability assumption (see Billingsley [2]’s Theorem 5.4 
and its preceding comments), the asymptotic variance of the weighted area estimator is 
Var(A=(/)) = Var(.72x?) = 
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Example 1 The expected value of the weighted area estimator with constant weighting 
function fo{t) = \/l2, for all t € (0, 1), is E(i4^(/o; n)] = cr^ + 37 /n 4- o{\/n). □ 

Henceforth, if the bias of an estimator for some parameter is o(l/n), we shall say that 
the estimator is first-order unbiased for that parameter. 

Example 2 If the weighting function f{i) satisfies F = F = 0 (in addition to the 
normalizing condition (1)), then Theorem 1 says that E[.4^(/;n)] = + o(l/n), i.e., 

i4^(/; n) is first-order unbiased for < 7 ^. Examples of weighting functions yielding first-order 
unbiased estimators for are fi{t) = >/840(3F — 3t 4- 1/2) and f{t) = y/Sir i cos{2ir it), 
i = 1,2, .. . (see [8]). □ 

3 The Weighted Cramer- von Mises Estimator 

In this section, we propose several estimators for < 7 ^ based on different Brownian bridge 
functionals. To parallel the discussion of §2, we define 



and 

dg) = r git)(aB{t))Ut, 

Jo 

where g{t) is a weighting function normalized so that E[C( 5 )] = <7^. 
In the sequel, we will require a number of assumptions to hold. 



Assumptions 

1. The process Y\, ... is stationary. 

2. The constants g. and < 7 ^ satisfy =» aW, where 



^n(0 






3. rr=-oo = 0. 

4. rr=i*^i^fei<oo. 

5 . g"{t) is continuous and bounded on [0, 1]. 

6. /q g{t)t{l — t)dt = I (normalizing assumption). 
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Remark 1 Assumptions 1-4 are conditions on the underlying stochastic processes. 
Glynn and Iglehart [9] list various sets of sufficient conditions for Assumption 2 to hold; 
these conditions usually involve moment and mixing conditions. Assumptions 3 and 4 
hold for a wide variety of stochastic processes. Assumptions 5 and 6 are simply conditions 
on the weighting function. □ 

Under the Assumptions, the continuous mapping theorem implies that C{g\n) — ► 
C{g). Notice that the limiting functional C{g) is the weighted area under the square of 
a Brownian bridge; by way of contrast, the weighted area estimator’s limiting functional 
A^{f) is the square of the weighted area under a Brownian bridge. 

The distribution of C{g) with constant weighting function 5o(0 — for all t € [0, l], 
was given by Anderson and Darling (1] and Smirnov [22]. Over sixty years ago, Cramer 
[4] and von Mises [23] studied statistics nearly of the form of C{gQ\n) for the special case 
of independent and identically distributed (i.i.d.) yi,V 2 ,.... Anderson and Darling [1] 
examined the distribution of C{g) with weighting function gAoit) = [^(1 ~ 0]”* (which 
does not quite meet our continuity assumption). However, the distribution of C{g) with 
an arbitrary weighting function has not been explicitly determined; see Durbin [6] for 
additional details. With this historical perspective in mind, we call C{g;n) the weighted 
Cramer-von Mises (CvM) estimator for 

If we observe that 

C{g:n) = -^■£a(,-)kHYl-27„y,+7l), 

” *=1 " 

then we can give cin easy 0{n) algorithm to calculate C{g,n): 



Z, S\y S 2 * 0 

FOR it = 1 TO n 
Z*-Z + Yk 
Si ^ S, + g(,^)kZ 
S2^Si + a(£)Z" 

Z — Z/n 

C(g; n) ^ Z^ g(i)k> - 2ZS, + 5, 

For now, we will be interested in moments of the CvM estimator. Our main theorem 
expresses the expected value of the CvM estimator C(g;n) in terms of its weighting 
function g(t) and the covariance function i?*. In what follows, we define G = /q g(t) dt. 

Theorem 2 Under Assumptions 1 through 6, 

E(C(j;n)l = tr^+^(G-l) + o(l/n). 
n 
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Proof See the appendix. □ 



Assumptions 3 and 4 allow us to derive a useful relationship between <J^ and (cf. 
Schmeiser and Song [19]): 

<t’ = nVar(F„) 

^ 1=1 j=l 

= /i„ + 2£(l--)ft 

i=l ^ 

OO O OO CO 

t=-.oo ^ 1=1 i=n ^ 

= <7^ + - + o(l/n). (2) 

n 

A corollary of the main theorem that gives the analogous expression for the bias of 
C{g\n) as an estimator for follows immediately from Equation (2). 

Corollary 1 Under Assumptions 1 through 6, 

E[C(y;n)l = <7^ + -(G - 2) + o(l/n). 
n 

Some examples illustrate the consequences of Theorem 2 and Corollary 1. The sim- 
plest example assumes a constant weighting function. 

Example 3 Theorem 2 implies that the CvM estimator with constant weighting function 
go{t) = 6 has expected value E[C(po;«)] = + b'^ln + o{lfn) = <jl+4'f/n + o{l/n). □ 

If G = 1 (subject to the constraints of Assumptions 5 and 6), Theorem 2 implies that 
the bias of C{g\n) as 2 m estimator of is o(l/n). In this case, C{g\n) is first-order 
unbiased for < 7 ^. Indeed, it is possible to give such a weighting. 

Example 4 Consider the quadratic weighting function g 2 -c{t) s 51 — c/2 -f ct — 150t^, 
where t € [0, 1] zmd c is real. Theorem 2 implies that E[G(^ 2 ;c; «)] = <7^ + o(l/n). □ 

Similarly, if G = 2 (subject to the constraints of Assumptions 5 and 6), Corollary 1 
implies that C[g\n) is first-order unbiased for cr^. 
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Example 5 Consider the quadratic weighting function g 2 -,c{t) = 42 — c/2 + ct — 120t^, 
where t € [0,1] and c is real. Since G = /o^2;c(0<^^ = 2, Corollary 1 implies that 
^\C{g2-.c>n)]= al + o{\ln). □ 

The choice of weighting function g{t) clearly affects the variances of C{g;n) and 
C{g). (The choice of weighting function f{t) affects the variance of but it does 

not affect the variance of the area estimator’s limiting functional, A^{f), in which case 
Var(i4^(/)) = 2(7'*.) To see how, we shall calculate Var(C(^)). First, we have 

Lemma 1 (Patel and Read (16, p. 309]). If R and S are jointly normal, then 
Cov{R\S^) = 2Cov^(R,5). 

This immediately yields the following theorem on the limiting variance of C{g\n). 

Theorem 3 In addition to Assumptions 1 through 6, suppose that the C^{g\ n)’s (n = 
1,2,...) are uniformly integrable. Then Var(C(^;n)) — ^ Var(C(^)), where 

Var(C(5)) = a”* / f g{$)g{t)Cov{B'^{s),B'^{t))dsdt 
Jo Jo 

= 2a^ f j g{s)g{t)Co\i'^{B{s),B{t))d$dt 
JQ Jo 

(by Lemma 1) 

= 4<t‘* / ^(<)(1— t)^ f g(s)s^ dsdt. 

Jo Jo 

Example 6 Consider the constant weighting function ^o(0 = ® from Example 3. The- 
orem 3 implies that Var(C(po)) = 4<7‘*/5. This limiting variance is significantly smaller 
than that of the area estimator, for which Var(A(/)) = 2a‘* (Theorem 1). □ 

Example 7 Consider the quadratic weighting function P2;e(0 from Example 4; this 
weighting function yields a first-order unbiased estimator for Theorem 3 gives 
Var(C(^ 2 ;c)) = (c^ — 300c-l-26856)a^/2520, a quajitity that is minimized by the weighting 
function 5^2(0 = 52:iso(0> whence Var(C(^J)) = 121a'^/70. This limiting variance is lairger 
than that of the CvM estimator using the constant weighting function ^o(0 (Example 6); 
of course, the estimator for <7^ based on 5o(0 somewhat biased (Example 3). □ 

Example 8 For completeness, consider the quadratic weighting function g 2 ,c{t) from 
Example 5; this weighting function yields a first-order unbiased estimator for <7^. The- 
orem 3 gives Var(C(^ 2 ;c)) = (c^ — 240c -f 18144 )cr^/2520. This variance is minimized by 
the weighting function g^it) = ^2;i2o(Oi which case Var(C( 5 |)) = 52<j‘*/35. Although 
this limiting variance is lower than that of the CvM estimator using ^|(<)» minimum- 
variance first-order unbiased quadratic weighting function (Example 7), it is still larger 
than that of the unweighted estimator using go{i) (Example 6). □ 
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Ideally, we would like to choose a weighting function that minimizes the variance of 
the CvM estimator for <7^ while satisfying the first-order unbiasedness and normalizing 
constraints; i.e., find g{t) that minimizes Var(C(^)) subject to 



G = \ = 

Jo 



( 3 ) 



With this goal in mind, suppose that g{t) can be written as an m-degree polynomial in 
ty i.e., 

m 

<€[0,1], 

1=0 

for coefficients Cq, Cj, . . . , Cm and fixed m. After some algebra, the problem becomes that 
of finding the coefficients that minimize 



Var(C(5„.)) = 



CiCj 



^0 f^o O' + 3) rifc=4(* + i + ^) 



subject to 



^ = 1 and — 



Ci 



= 1 . 



(i -f 2)(z + 3) 

We can use Lagrangian multipliers to solve the above system. Here the Lagrangian 
is given by 

Cj , . . . , 0 , 7 , , Ai , A 2 ) 

■ ‘SS c .anCW, ..I -*• (tarwTT, 



where Ai and A 2 are constants. One takes the m + 3 partial derivatives of £, sets the 
resulting equations to zero, and solves the resulting system of linear equations for the 
m + 3 unknown coefficients. 



Example 9 It is easy to show via the Lagrangian method that the optimal-variance, 
first-order unbiased, quadratic and cubic polynomial weighting function is p$(<) = -24-1- 
150< — 150<^, the choice studied in Example 7. The best quartic turns out to be 

, _ -1310 19270< 25230<2 16120<3 8060<^ 

«,(<) = - 2 r+~ 2 i T ~-^— 3“’ 

in which case Var(C(5j)) = 1.042<r^. We can go further. For example, the polynomicJ 
weighting function of degree m = 6 that minimizes Var(C(^6)) subject to the constraints 
(3) is given by gg(t) = ]2f-o Cii', where the c,’s are as follows. 
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i 


0 


1 


2 


3 


4 


5 


6 


Ci 


-132.9358 


3439.9542 


-26622.7987 


93037.7083 


-163198.9022 


140016.0576 


-46672.0191 



This choice of weights yields the optimal Var(C(^|)) = 0.8093<r^, which is comparable to 
the variance of the unweighted (albeit biased) estimator C{go;n). □ 

Remark 2 In order to achieve further variance savings, we can continue to increase the 
degree of the polynomial weighting function. However, the magnitudes of the resulting 
coefficients become quite large, and one must be careful to avoid round-off error as well 
as deleterious second-order effects for small sample sizes. □ 



4 Some Analytical Examples 

This section presents exact analytical results involving specific stochastic processes. We 
shall first obtain some useful expressions for the expected values and variances of the 
area and CvM estimators. We assume in the sequel that Assumptions 1 through 6 are 
still in effect. 

We begin with an intermediate result on the area estimator. 



E|/t^(/;»)l = Var(/t(/;n)) 

= ^i:f:/(;)/(7)cov(T„(i).r„(i)) 



n 



= ^ L i: /(-)/(f jv'Covo^. - - Y,). 

Ti'^ ^ ^ n n 



(4) 



i=l J=1 



Further, if A{f;n) is normal, then Lemma 1 implies that 



Var(/l’(/;n)) = 2Var^(/l(/;n)) = 2(E[A^(/;n)|)>. 



(5) 



The analogous result on the expected value of the CvM estimator is derived next. 



E[C(j;n)| 



r2 n 



-i:«(-)v>Krn(-)) 
” *=1 ” ” 



( 6 ) 
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In addition to the standing Assumptions, suppose that K] , V2> • • • , are jointly nor- 
mal. Then 

Var(C(s;n)) = 3 E Es(-)s(^)Cov(r.=(-),r,^(i)) 






•=i j=i 



n n 



n 



n 



= K'tl E ?(^)s(^)Cov>(T„(i),r„(^)) 

n n ^ r> n 

(by Lemma 1) 

* ' ' \7 \7 \7 

n n 



4 n— 1 n— 1 



n 



n 



(7) 



= ^ E E - r,, - Y,). 

” .= 1 j=i " " 

We now have at our disposal the machinery to study specific examples in which we 
calculate the exact expected values and variances of A^(/;n) and C{g;n) for various 
weighting functions. 

In particular, for the remainder of this section, we shall work with a first-order moving 
average (MA(1)] process, Vj+i = Oci -I- Cj+i, i = 1,2, . . ., where the Cj’s are i.i.d. Nor(0, 1); 
so i?o = 1 + R±i = and Rk = 0, otherwise. One can derive 



Var(r,) = 



(1+^)^ 26 _ ^ 2 

j j7 j +j2 



and 



Cov(y„n) = ^ + ^ for 



( 8 ) 



(9) 



where <7^ = (1 -\-6Y and 7 = —26. Note that for the MA(1) process. Equation (8) implies 
that cr^ = 2 ^ a result that agrees with Equation (2). 

Example 10 We concentrate here on some area estimator expectation results for the 
MA(1) process. For the consteint weighting function /o(<) = \/l2. Equations (4), (8), 
and (9) show that 



37 



37 



E[AU;«)] = = aU^+o(l/n), 

n n 

as implied by Excimple 1. For the first-order unbiased weighting function f 2 {t) = 
>/840(3<2 - 3t + 1/2), we have (also see [8]) 

<r2(2n« + In* 63n2 - 72) -I- 2l7(2n^ - Sn^ -|- lOn^ + 5n - 12) 



ih;n)] = 



2n6 



= (j'^ + o{\/n), 



which is indeed first-order unbi2ised for < 7 ^ and thus is in accord with Example 2. □ 
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Example 11 We consider CvM expectation results for the MA(1) process, 
constant weighting function ^o(0 = 6, Equations (6), (8), and (9) show that 



E[C(^o;n)] 



_2 , 67 (7^ + 67 , 7 

<7 H 1 — - 

n 

<7^ + — + o(l/n), 
n 



For the 



as implied by Example 3. For the quadratic weighting function ^2;c(0 = 51 — c/2 + ct — 
150t^, we get 



E[C(<72=c;n)] = 



<7^(n‘* 4- 4n^ — 5) 7(24n^ — 29n^ 4- 5) 

4~ r 



n' 






= <7^4- 



4((7^ 4- 67) 






4-0(n-^) 



= cr^4-o(l/n). 



This demonstrates that y2;c(0 yields a first-order unbiased estimator for (independent 
of the choice of c), as implied by Example 4. For the optimal quartic weighting function 
we have (after a great deal of algebra) 



E[Cten)l = + 



655(g^ + 67) 
63n^ 



4-0(n"'*) = cr^4-o(l/n), 



which shows that this weighting function yields a first-order unbiased estimator for < 7 ^, eis 
anticipated by Example 9. Similarly, the optimal sixth-degree weighting function gl{t) 
gives 

22.156(<72 -^67) 



E[C(^6;n)] = <7^4- 






4- o{n = < 7 ^ -f- o(l/n), 



so that this weighting function also produces a first-order unbiased estimator for cr^. 
Notice that the term in the expression for E[C(5|;n)| is quite large compared to 
the terms from the estimators incorporating quadratic or quartic weights; in fact, es 
alluded to bv Remark 2, large values of n are required before the second-order term in 
E(C(i?|; n)j becomes insignificant. 

For completeness, we give results on the quadratic weighting function y2;e(0 = 42 — 
c/2 4- ct — 120t^. In this case, we find that (after some algebra) 



E[C'(y2:c;n)] = 



a^(n* 4- 3n^ - 4) 7(n^ 4- I8n^ - 23n^ 4- 4) 






+ 






3(<7^ 4- 67) 



= i7^ + - + 

n n‘ 

= <7^4-o(l/n). 



4- 0{n ^) 



which demonstrates that this weighting function produces a first-order unbiased estimator 
for < 7 ^ (independent of the choice of c), as implied by Example 5. □ 
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Example 12 Since the MA(1) is a jointly normal process, we can easily derive area 
estimator variance results for it. Equation (5) and Example 10 imply that for weighting 
functions fo{t) = y/l2 and f^it) = \/840(3<^ — 3< + 1/2), we have Var(A^(/;n)) = 
2 <t‘* + o(1). These results make sense in light of Theorem 1. □ 

Example 13 We examine the variance of the CvM estimator for various weighting func- 
tions. For the constant weighting function go(t) = 6, Equation (7) gives us (after tedious 
but straightforward algebra) 

Var(C(go;n)) = ^ + 0(n-^) = ^+o(l), 

as implied by Example 6. For the (variance-optimal and first-order unbiased for cr^) 
quadratic weighting function = “24 4- 150i — 150<^, some algebra yields 

Var(C(5r2i ”)) = 1.7286<7‘‘ H |-0(n~^) = 1.7286cr'* 4- o(l), 

n 

as implied by Examples 7 and 9. For the (variance-optimal and first-order unbiased for 
(7*) quartic weighting function ^^(Oi we can obtain 

Var(C(^ 4 ; n)) = 1.0418<7‘‘ 4- 4- 0(n“^) = 1.0418<7‘‘ 4- o(l), 

n 

as implied by Example 9. 

Finally, for the (variance-optimal and first-order unbiased for cr^) quadratic weighting 
function gl{t) = -18 4- 120< - 120<*, some algebra yields 

Var(C(^2!”)) = 1.4857<7‘* 4- — — 4- 0(n“^) = 1.4857cr'‘ 4- o(l), 

n 

BiS implied by Example 8. □ 

We see from the above examples that the area and CvM estimators behave as adver- 
tised on the simple analytical MA(1) example. The CvM estimator using the Anderson- 
Darling weighting function (which fails to satisfy some of the Assumptions) does not 
behave so nicely. 

Example 14 To complete our series of examples with the MA(1) process, we consider 
the expected value of the Anderson-Darling estimator, i.e., the CvM estimator with 
weighting function gAD{t) = [<(1 — O]”** Then it can be shown that 

E[C(,.,c;n)| = i) 

Tl 71 Y 7Z 2 ft y 

« a2(l-i)-2(i_i_2(£n(n-l)-hc.)) 
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where c* ss 0.577216 is Euler’s constant. Although this estimator is asymptotically 
unbiased for <7^, the convergence rate of the expectation to <7^ is much slower than those 
of the previous examples. □ 

We resort to Monte Carlo simulation in the next section to empirically evaluate the 
performance characteristics of the various estimators on more complicated stochastic 
processes. 



5 Empirical Examples 

In this section, we present empirical examples illustrating the performance characteristics 
of the following variance estimators: 

• A^(/o;n) — unweighted area estimator. 

• — first-order unbiased quadratic area estimator for <7^. 

• C{go’,n) — unweighted CvM estimator. 

• C{gAD\n) — Anderson-Darling estimator. 

• ^( 52 ! — minimum- variance first-order unbiased quadratic CvM estimator for 

• C'( 54 ;n) — minimum- variance first-order unbiased quartic CvM estimator for 

• ”) — minimum- variance first-order unbi«ised sixth-degree CvM estimator for 

(7^. 

• C{g 2 ',n) — minimum- variance first-order unbiased quadratic CvM estimator for 

These examples involve the Monte Carlo simulation of a number of stationary stochas- 
tic processes: 

• The first-order autoregressive process [AR(1)], = i = 1,2,..., where 

the Cj’s are i.i.d. Nor(0, 1 — with — 1 < <^ < 1. 

• The first-order exponential autoregressive process [EAR(l)], Yi+i = <f>Yi -}- c,+i, 
i = 1,2,..., where the c,’s are i.i.d. exponential(l) with probability 1 — ^ and 0 
otherwise, and where 0 < ^ < 1. (See Lewis [14] for more details.) 

• The M/M/1 queueing system’s waiting-time process. 
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Table 1: Estimated Expected Values of Various Variance Estimators — AR(1), 4> = 0.9. 
(Note that = 19.0 for this process.) 



n 


A^(/o;n) 


A-(/2;n) 


C(go-,n) 


C(gAD\ n) 


C{g*2\n) 


C{gl,n) 




C{gV,n) 


<^n 


4 


0.289 

(0.001) 


0.402 

(0.002) 


0.225 

(0.001) 


0.177 

(0.001) 


0.316 

(0.001) 


0.294 

(0.001) 


0.222 

(0.001) 


0.298 

(0.001) 


3.5245 


8 


0.974 

(0.004) 


0.966 

(0.004) 


0.698 

(0.003) 


0.576 

(0.002) 


0.940 

(0.004) 


0.757 

(0.003) 


0.674 

(0.003) 


0.892 

(0.004) 


6.1855 


16 


2.837 

(0.013) 


2.677 

(0.012) 


2.041 

(0.008) 


1.701 

(0.007) 


2.747 

(0.012) 


2.168 

(0.008) 


1.847 

(0.007) 


2.606 

(0.011) 


9.8346 


32 


6.533 

(0.029) 


6.439 

(0.029) 


4.934 

(0.019) 


4.141 

(0.015) 


6.580 

(0.028) 


5.262 

(0.019) 


4.481 

(0.016) 


6.251 

(0.026) 


13.5681 


64 


11.269 

(0.050) 


11.889 

(0.053) 


9.186 

(0.033) 


7.812 

(0.026) 


11.925 

(0.050) 


9.903 

(0.034) 


8.645 

(0.028) 


11.377 

(0.046) 


16.1908 


128 


14.914 

(0.067) 


16.200 

(0.072) 


13.148 

(0.044) 


11.476 

(0.035) 


16.119 

(0.066) 


14.395 

(0.047) 


12.811 

(0.039) 


15.525 

(0.062) 


17.5938 


256 


16.836 

(0.076) 


18.017 

(0.081) 


15.752 

(0.049) 


14.236 

(0.040) 


17.968 

(0.074) 


17.207 

(0.055) 


16.073 

(0.046) 


17.525 

(0.068) 


18.2969 


512 


17.989 

(0.081) 


18.824 

(0.085) 


17.349 

(0.052) 


16.164 

(0.043) 


18.783 

(0.078) 


18.507 

(0.059) 


17.924 

(0.051) 


18.496 

(0.072) 


18.6484 


1024 


18.372 

(0.082) 


18.875 

(0.084) 


18.084 

(0.053) 


17.285 

(0.044) 


18.838 

(0.078) 


18.867 

(0.060) 


18.617 

(0.052) 


18.687 

(0.072) 


18.8242 


2048 


18.799 

(0.084) 


19.064 

(0.085) 


18.599 

(0.053) 


18.073 

(0.045) 


19.027 

(0.079) 


19.091 

(0.061) 


18.932 

(0.054) 


18.941 

(0.073) 


18.9121 



For both the AR(1) and EAR(l) processes, the covariance function Rk = k = 
0, ±1, ±2, The covariance function of the M/M/1 waiting time process is more com- 

plicated (cf. Daley [5]). 

We simulated the above stochastic processes over a variety of parameter values; rep- 
resentative results are presented in Table 1 (AR(1) with <f> = 0.9), Table 2 (EAR(l) with 
4> = 0.9), and Table 3 (M/M/1 waiting time process with arrival rate 0.8 and service rate 
1.0). Each table entry in a row is based on the same 100,000 independent replications 
of the stochastic process. The number in parentheses below an entry is the standard 
error of that entry. Each of the replications was initialized from the appropriate steady- 
state distribution. All uniform [normal] random variates were generated from algorithm 
UNIF [TRPNRM] in Bratley, Fox, and Schrage [3j; exponential deviates used inversion; the 
M/M/1 waiting-time process was generated from an algorithm due to Schmeiser [18]. 
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Table 2: Estimated Expected Values of Various Variance Estimators — EAR(l), <f> = 0.9. 
(Note that = 19.0 for this process.) 



n 


A^fo\n) 


A'Hf2-,n) 


C(jo;n) 


C(gAD',n) 


C(5$;n) 


C{gX\n) 


C{gl,n) 


C(g2,n) 


f 


4 


0.295 

(0.004) 


0.410 

(0.006) 


0.229 

(0.003) 


0.180 

(0.002) 


0.322 

(0.005) 


0.299 

(0.004) 


0.226 

(0.003) 


0.303 

(0.004) 


3.5245 


8 


0.979 

(0.009) 


0.971 

(0.010) 


0.703 

(0.007) 


0.580 

(0.005) 


0.946 

(0.010) 


0.764 

(0.007) 


0.677 

(0.006) 


0.897 

(0.009) 


6.1855 


16 


2.859 

(0.021) 


2.700 

(0.021) 


2.059 

(0.014) 


1.717 

(0.011) 


2.770 

(0.021) 


2.188 

(0.015) 


1.866 

(0.012) 


2.627 

(0.019) 


9.8346 


32 


6.554 

(0.040) 


6.504 

(0.041) 


4.960 

(0.027) 


4.160 

(0.022) 


6.630 

(0.040) 


5.281 

(0.029) 


4.497 

(0.024) 


6.296 

(0.037) 


13.5681 


64 


11.303 

(0.063) 


11.892 

(0.067) 


9.201 

(0.043) 


7.824 

(0.035) 


11.934 

(0.062) 


9.932 

(0.047) 


8.570 

(0.039) 


11.388 

(0.058) 


16.1908 


128 


15.008 

(0.078) 


16.241 

(0.084) 


13.218 

(0.053) 


11.536 

(0.044) 


16.198 

(0.077) 


14.491 

(0.059) 


12.884 

(0.050) 


15.602 

(0.072) 


17.5938 


256 


16.942 

(0.082) 


18.125 

(0.088) 


15.821 

(0.055) 


14.289 

(0.047) 


18.081 

(0.080) 


17.246 

(0.063) 


16.138 

(0.055) 


17.629 

(0.074) 


18.2969 


512 


17.984 

(0.085) 


18.810 

(0.087) 


17.345 

(0.056) 


16.161 

(0.048) 


18.771 

(0.080) 


18.511 

(0.064) 


17.925 

(0.057) 


18.486 

(0.074) 


18.6484 


1024 


18.374 

(0.084) 


18.841 

(0.086) 


18.088 

(0.055) 


17.286 

(0.047) 


18.835 

(0.079) 


18.925 

(0.063) 


18.592 

(0.055) 


18.685 

(0.073) 


18.8242 


2048 


18.719 

(0.085) 


18.955 

(0.086) 


18.552 

(0.054) 


18.029 

(0.046) 


18.985 

(0.079) 


19.039 

(0.062) 


18.849 

(0.055) 


18.898 

(0.073) 


18.9121 
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Table 3: Estimated Expected Values of Various Variance Estimators — M/M/1 Waiting 
Time Process; arrival rate = 0.8, service rate = 1.0 



n 


AHfo\n) 




C(<7o;n) 


C(gAD\n) 


C(92\n) 


C(ff:;r») 


C{96\n) 


C’(j5;n) 


16 


30.9 


28.0 


21.6 


18.0 


29.1 


22.9 


19.5 


27.6 




(0.2) 


(0.2) 


(0.1) 


(0.1) 


(0.2) 


(0.1) 


(0.1) 


(0.2) 


32 


93.9 


85.3 


66.4 


55.5 


89.5 


70.5 


59.7 


85.1 




(0.6) 


(0.5) 


(0.4) 


(0.3) 


(0.5) 


(0.4) 


(0.3) 


(0.5) 


64 


251.7 


235.7 


182.9 


153.2 


245.5 


194.4 


165.2 


235.0 




(1.7) 


(1.6) 


(1.1) 


(0.9) 


(1.6) 


(1.1) 


(0.9) 


(1.5) 


128 


574.0 


562.2 


434.5 


365.7 


577.1 


464.1 


396.1 


542.6 




(4.5) 


(4.4) 


(3.0) 


(2.4) 


(4.4) 


(3.1) 


(2.5) 


(4.0) 


256 


1002.1 


1035.6 


810.4 


690.7 


1047.3 


875.5 


756.2 


1019.0 




(9.1) 


(9.2) 


(6.2) 


(5.0) 


(9.0) 


(6.5) 


(5.3) 


(8.7) 


512 


1415.6 


1541.2 


1228.0 


1065.4 


1532.2 


1327.5 


1180.1 


1493.9 




(14.4) 


(15.4) 


(10.1) 


(8.3) 


(14.6) 


(10.9) 


(9.0) 


(14.6) 


1024 


1708.8 


1827.8 


1563.1 


1398.0 


1824.8 


1710.9 


1569.0 


1769.4 




(18.4) 


(19.3) 


(13.2) 


(11.1) 


(18.0) 


(15.2) 


(12.8) 


(16.3) 


2048 


1831.7 


1919.2 


1749.5 


1616.4 


1922.7 


1879.0 


1803.4 


1921.2 




(17.0) 


(17.5) 


(12.6) 


(11.2) 


(15.8) 


(15.0) 


(14.0) 


(15.7) 


4096 


1914.4 


1986.6 


1867.3 


1767.7 


1979.7 


1958.5 


1918.1 


1954.2 




(14.2) 


(14.7) 


(10.6) 


(9.9) 


(13.0) 


(12.2) 


(12.3) 


(13.2) 


8192 


1970.7 


2016.0 


1940.9 


1870.6 


2008.2 


2009.8 


1980.5 


1944.3 




(12.3) 


(12.4) 


(8.9) 


(8.5) 


(11.1) 


(10.3) 


(10.2) 


(10.3) 



6 Discussion 



This section summarizes and discusses the exact and estimated expectation and variance 
results for the variance estimators examined in §§2-5. Recall that we obtained exact 
results for area estimators in §2 and for CvM estimators in §3. We also gave exact results 
for a specific stochastic process, the MA(l), in §4 and empirical results for the AR(1), 
EAR(l), and M/M/1 in §5. The estimated expected values given in Tables 1-3 are based 
on 100,000 replications; thus, one can obtain estimated variance results from the Monte 
Carlo runs by squaring the estimated standard errors (in parentheses below the estimated 
expected values) and then multiplying by 100,000. 

For each of the stochastic processes under study, the expected value of the unweighted 
area estimator n) converged relatively slowly to as n increased. This phe- 

nomenon is due to the estimator’s comparatively high Bias(A^(/o; n)) « Z^/n (Example 
1; also see [17]). 

The expected value of the unweighted CvM estimator C{go‘, n) converged more slowly 
than that of A^{fo;n) to This makes sense since Bias(C(^o; ”)) ^ 57/n (Example 3) 
is higher than A^(/o;n)’s. On the plus side, we see that for large n, 

Var(C(yo;n))/Var(A^(/o;n)) « ^ = 0.4, 

as suggested by Theorem 1 and Example 6. 

The expected value of the Anderson- Darling CvM estimator C{gAD\ n) converged even 
more slowly than that of C{go',n) to Although we did not prove the inferiority of 

the convergence rate of E[C(^/i£);n)] to for general stationary processes, the evidence 

provided by Example 14 and the empirical work seems to be overwhelming. Some calculus 
shows that ^ 

yar{C{gAD)) = (^ " « 0 . 5797 ( 7 ^ 

D 

(cf. [21]); but it is of little comfort that C{gAO \ has the lowest variance of the estimators 
under study. 

The expected values of the first-order unbiased quadratic area estimator for a^, 
A^{f 2 \n,), and the minimum-variance first-order unbiased quadratic CvM estimator for 
<T^, C{g 2 ;n)y converged comparatively quickly to as n increaised; the rapid convergence 
is a direct consequence of the first-order unbiasedness of the estimators. For large n, we 
see from Examples 12 and 13 and the empirical tables that 

Var(C(^*;n))/Var(A^(/ 2 ;n)) « 121/140, 

as predicted by Theorem 1 and Example 7. 

The minimum-variance first-order unbiaised fourth- and sixth-degree CvM estimators, 
C{g^;n) and C(y|;n), respectively, possess expected values that converge to almost 



(but not quite) as quickly as those of n) and C{g 2 ',n). A favorable property of these 

higher-degree estimators is that they have reduced variances; the variance improvements 
are along the lines described in Example 9. 

Recall that the estimator C(^ 2 ;n) is first-order unbiased for (Example 5). We 
can compare the estimated expected values from Tables 1 and 2 with the corresponding 
actual cr^-values (given in the l<ist column of the tables). We see that the bicis of C{g 2 ', n) 
as an estimator for is about the same as Bias(A^(/ 2 ; n)) and Bias(C( 5 j; n)); this is 
particularly true for large n. In addition, Var(C(^J;n)) is only a little smaller than 
Var(C(pJ; n)). Thus, we do not seem to gain much by using C(^ 2 ;«) to estimate 

The bottom line: Of the estimators studied so far, it appears that Cig^'.n) performs the 
best. 



7 Conclusions 

In this article, we introduced a class of estimators for cr^ = lim„_oo nVar(K„) that are 
similar to Cramer-von Mises statistics. Using appropriate weighting functions, our CvM 
estimators were shown to be first-order unbiased, and asymptotic variance reductions 
of up to 60% (compared to the weighted area estimator) were achievable. Further, the 
variance-optimal weighting functions can be computed independently of the output pro- 
cess. Our conclusions were supported with an analytical example using the MA(1) pro- 
cess. 

Although the estimators are all asymptotically unbiased for er^, finite-sample bias 
can be substantial. Analytical and empirical work showed that the bi<is of the CvM 
estimators converged to zero at leeist as fast as that of the weighted area estimators, and 
the CvM estimators had smaller variance. Thus, if the sample size is sufficiently large, 
the CvM estimators proved to be more efficient than the weighted area estimators. 

As discussed in [10], it is possible to augment the basic CvM variance estimator in a 
number of ways. 

1. One can show that the unweighted CvM and area estimators are highly correlated. 
This suggests that certain linear combinations of the area and CvM estimators will 
give rise to estimators having comparatively lower variance. 

2. All of our work so far has assumed that we have one long batch of n observations. 
An alternative way of organizing the data is to break the n observations into b 
contiguous, nonoverlapping batches, each of size m (eissume n = 6m). This leads to 
another interesting problem — that of examining the consequences of batching the 
data and then forming CvM estimators from each batch. Intuitively, batching of the 
data will tend to increase estimator bias while decrecising estimator variance — of 
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course, one can quantify the trade-off by calculating the batched CvM estimator’s 
mean squared error. (See Schmeiser and Song [19].) 

3. We can also apply the methodology of Meketon and Schmeiser [15] in which the 
n observations are broken into n — m -f 1 overlapping batches, each of size m. 
Although the bias of the resulting overlapped CvM estimator is the same as that of 
the batched CvM estimator, the overlapped estimator’s variance is much smaller. 

The above problems are the subjects of ongoing research and will be the topics of a 
future paper. 
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Appendix 

This appendix contains the proof of Theorem 2. Before proving the theorem, we state 
and prove a series of lemmeis. First, we define the cumulative sums Z* = Fj and 
the variance time curve V{k) = Var(Zfc), k = 1, 2, . . . , n (see [11]). Since g{t) is aissumed 
to be continuous and bounded on [0, 1], we denote M = supo<t<i |5(0I < 

Lemma 2 Under the Assumptions of §3, 

V{n) = ncr^ -f 7 — 2^(n - t)/?,- = ntr^ -f 7 -f o(l). 

t=n 

Proof Follows from the arguments leading to Equation (2). □ 



Lemma 3 (The discrete-time version of a result given in [11]-) Under the Assumptions 
of §3, if n > A:, we have 

2Cov(Z„,Z*) = V{n) + V{k)-V{n-k). 

Proof By stationary increments, 

U(n) = Var(Z(n) - Z(fc) -f Z(it)) 

= Var(Z(n) - Z(Jfc)) -|- Var(Z(Jt)) + 2Cov(Z(n) - Z(Ar), Z(Jb)) 

= Var(Z(n-Jt))-f U(it)-|-2Cov(Z(n),Z(Jt))-2U(lk). □ 
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Lemma 4 Under the Assumptions of §3, 




< = 0(n=*). 

*=i 



Lemma 5 Under the Assumptions of §3, 



” k 



k=\ 



n 



EU - n)Rj + EU - >‘)Ri - E -in- k))R, 

j=k j=n— ik 



j=n 









n—l 



n— 1 



3Eil^.l + Eil^il+ E jW 



= M 
= o(n^). 

Lemma 6 Under the Assumptions of §3, 



j=k j^n^k 

|n(n + 1) + n^ 

^ j=n k=\ 



Ep(-)E 0‘ -^)^j 

k=i ” :^k 

< ME:Em,\ 

Jfe=l i=k 

= M^P|i?fe| + Mn j\Rj\ 

k=l j=n+l 

= o(n). 

Lemma 7 Under the Assumptions of §3, 

+ 7 + = {na'^ + 'r)f29i^)k^ -koin^)- 

k=i ” k=i ” ik=i ” 

Proof Follows from Lemmas 2 and 4. □ 
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Lemma 8 Under the Assumptions of §3, 

f;5(-)Kov(Z„,Z0 



ib=l 



n 



= Es(-)‘- 



k=\ 



n 



k(7^ + ^ + Y,U - + £(; - k)Rj - ^ (i - (n - k))Rj 



j=n 

” it 



j=k 



}=n—k 






n 



fc=i “ “ fc=i 

Proof Follows from Lemmas 2, 3, and 5. □ 



Lemma 9 Under the Assumptions of §3, 



tsi^nk) = tsA 



k=l 



n 



Ar=l 



n 



kc^ + T + 2 - k)R, 

= + + o(n). 



^ k 
n ' 



ib=i 

Proof Follows from Lemmas 2 and 6. □ 



ik=i 



n 



Lemma 10 (Trapezoid Rule). If e"{t) is continuous and bounded V< 6 [0, 1], then 

p 1 



L 



, 1 «(0)“ c(l) ^ 

e{s)ds = -^c(-) + -i — - + o{l/n). 






2n 



We can finally prove Theorem 2. 



Proof (of Theorem 2). 

E[lV'(n)| = 1^5(i)(;2E|(F„-F*n 

r) * n 



”^1 « 



= ^ E - 1 1 «(;)Ko.(Z., ZO + ;!5 E 9(^)V(k) 



jrr ” 









1 1"t l"t*l fl”i* 1”^ l"itl 1 



n~i n n%Tr « 

(by Lemmas 7, 8, and 9, and algebra) 



= o* 



/ y(0(^ -t^)dt + - f g{t){t^ -t + l)dt + o{\/n) 
Jo n Jo 



(by Lemma 10). 
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Application of Assuniption 6 completes the proof. □ 



Remark 3 It is quite a bit easier to prove the continuous-time version of the theorem 
(cf. [11]). 
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